
Conversation

@victor-eds
Contributor

Detect basic shuffles and lower them to `gpu.shuffle` operations. Basically, support cases in which we go from each work-item holding a single tensor element to each work-item holding `sub_group_size` tensor elements, such that element `i` corresponds to the element originally held by work-item `i` in the sub-group.

The upstream MLIR pass should handle all integer and floating-point types. Drop the code handling type legalization for those types once that is done. Pointer types should still be handled in this project.

The code should be extended to support other kinds of shuffles.

The multi-sub-group case is not yet implemented.
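
For illustration only, here is a minimal MLIR sketch of the per-lane index shuffle such a lowering builds on. This is not the actual pass output; the function name, the `f32` element type, and the constant sub-group size of 16 are assumptions made for the example.

```mlir
// Hypothetical sketch: fetch the f32 value originally held by work-item
// %lane of the current sub-group using an index shuffle. Repeating this for
// every lane (0 .. sub_group_size - 1) gives each work-item the full set of
// sub_group_size elements described above.
func.func @gather_from_lane(%src: f32, %lane: i32) -> f32 {
  // Assumed sub-group size of 16 for this example.
  %width = arith.constant 16 : i32
  // %res is the value contributed by work-item %lane; %valid indicates
  // whether %lane was inside the shuffle width.
  %res, %valid = gpu.shuffle idx %src, %lane, %width : f32
  return %res : f32
}
```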

@victor-eds
Contributor Author

Part of #2266.

@chengjunlu
Contributor

Is this urgent for the OKR performance goals?
If it is not urgent, could we upstream this change to Triton first and then pull it into the downstream?

@victor-eds
Contributor Author

> Is this urgent for the OKR performance goals? If it is not urgent, could we upstream this change to Triton first and then pull it into the downstream?

I agree this should be upstreamed. However, it isn't generic enough IMO, and I would like to work on it more before upstreaming. I would rather have it merged here first and a generic version upstreamed later. WDYT?

@etiotto linked an issue Oct 23, 2024 that may be closed by this pull request
@chengjunlu
Contributor

> > Is this urgent for the OKR performance goals? If it is not urgent, could we upstream this change to Triton first and then pull it into the downstream?
>
> I agree this should be upstreamed. However, it isn't generic enough IMO, and I would like to work on it more before upstreaming. I would rather have it merged here first and a generic version upstreamed later. WDYT?

Makes sense. We can make it more general gradually in the downstream first.

@victor-eds enabled auto-merge (squash) October 24, 2024 10:54
@victor-eds disabled auto-merge October 24, 2024 10:57
@victor-eds enabled auto-merge (squash) October 24, 2024 10:59
@victor-eds disabled auto-merge October 24, 2024 10:59
@victor-eds enabled auto-merge (squash) October 24, 2024 11:38
@victor-eds merged commit 6647f59 into intel:main Oct 24, 2024
4 checks passed

Successfully merging this pull request may close these issues:

Port "sub-group transpose reduction" to default path